Abstract



Random scale-free overlay topologies provide a number of
properties, such as high resilience against failures of random
nodes, small (average) diameter, and good expansion and
congestion characteristics, that make them interesting for use
in large-scale distributed systems. A number of these
properties have been shown to be influenced by the exponent
$\gamma$ of their degree distribution
$P(k) \propto k^{-\gamma}$. In this article, we present a
distributed rewiring scheme that is suitable to effectuate
scale-free overlay topologies with an adjustable exponent. The
scheme uses a biased random walk strategy to sample new
endpoints of edges being rewired and relies on the equilibrium
model for scale-free networks presented in [19]. The bias of
the random walk strategy can be tuned to produce random
scale-free networks with arbitrary degree distribution
exponents greater than two. We argue that the rewiring
strategy can be implemented in a distributed fashion based on
a node's information about its immediate neighbors. We present
both analytical arguments and results obtained using an
implementation of the proposed protocol.



1. Motivation 

During the last decade, the increasing spread and impor- 
tance of large-scale Peer-to-Peer systems has raised signif- 
icant research interest in the design and analysis of robust 
and efficient overlay networks. In this research, structured 
and unstructured approaches can be distinguished. Mim- 
icking the use of data structures in traditional computing, 
highly structured overlay topologies facilitate the use of 
efficient distributed algorithms with deterministic perfor- 
mance. However, the overhead entailed by the construc- 
tion and maintenance of such deterministically structured 
topologies questions their usability in large-scale scenarios
with dynamic and potentially faulty participants. Constituting
a different approach, unstructured overlay networks
do not impose constraints about the detailed structure of 
the emerging network topology. Rather than using costly 
and potentially complex routines for building and maintain- 
ing sophisticated network structures, in such unstructured 
overlays links can arise in a seemingly random and unco- 
ordinated fashion. They are thus particularly suitable for 
highly dynamic scenarios in which the operational overhead 
entailed by structured approaches can possibly dominate a 
system's overall performance [6].

While the use of unstructured overlays can reduce con- 
struction and maintenance overhead, designing efficient dis- 
tributed algorithms with predictable performance is hardly 
possible when making no assumptions whatsoever about 
an overlay's structure. Interestingly, based on a stochas- 
tic model of the system in question and arguments from 
random graph theory and complex network science, it is 
often possible to reason about structural properties of the 
resulting network topology that hold almost surely in the 
limit of large systems. Similarly, the performance of a num- 
ber of dynamical processes - many of them relevant to dis- 
tributed computing systems - has been studied in random 
network structures. For sufficiently large systems, based on 
randomized overlay topologies one can thus obtain strong, 
though probabilistic guarantees about their structure and 
performance. Considering the classical taxonomy of deter- 
ministically structured and completely unstructured overlay 
networks, this suggests an intermediate class of probabilis- 
tically structured topologies that promises to combine the 
benefits of both. 

During the last decade, much of the work in the field of
random networks has focused on scale-free networks, which are
characterized by a power law degree distribution
$P(k) \propto k^{-\gamma}$. The fact that networks with such
scale-free characteristics seem to emerge naturally in a
variety of natural, social and technological contexts has
awakened the interest of researchers in disciplines as diverse
as mathematics, statistical physics, biology, sociology, and
computing. It has since been shown that scale-free networks
provide a number of interesting properties like a remarkable
robustness against random failures B1 [14], small diameter and
average path lengths [12][9] as well as favorable expansion
and congestion properties IITtI [34]. Some of these properties
make them interesting for the design of large-scale computing
systems and, in fact, for certain networked computing systems
it has been observed that scale-free structures emerge in a
seemingly self-organized way 13] [15] [28] [30].

Based on this observation, during the last couple of years,
the performance of distributed algorithms operating in
scale-free networks has been studied. For the problems of
distributed search ||T||27][l^, information dissemination and
entropy reduction protocols [23] as well as synchronization
[32], distributed schemes have been derived that seem to work
particularly efficiently in scale-free networks. Considering a
scale-free network topology with a degree distribution
$P(k) \propto k^{-\gamma}$, it has further been argued that
the exponent $\gamma$ has massive influence on network
properties like diameter [12], the vulnerability against
targeted attacks as well as the performance of dynamical
processes fSJ. The reason for this can be found in the fact
that the exponent $\gamma$ determines the skewness of the
degree distribution and thus the frequency and magnitude of
highly connected hub nodes. For the practical design of
scale-free overlay topologies, the degree distribution
exponent is thus a critical parameter which largely influences
their robustness, the performance of distributed algorithms as
well as the distribution of load being imposed on individual
machines.

In this article, we study how systems with scale-free overlay
structures can adapt the degree distribution exponent and thus
tune the heterogeneity of overlay connectivity in a
distributed and directed fashion while maintaining the overall
scale-free characteristics of their topology. Based on
analogies to equilibrium and non-equilibrium statistical
physics that have been put forth in the study of complex
networks lJl [16] and the fact that, in the limit of large
systems, a number of network properties change abruptly when
the degree distribution exponent exceeds certain critical
points, one may view such a mechanism as an active triggering
of network phase transitions fTT]. In this article, we discuss
a simple distributed rewiring mechanism that can be used for
this purpose in non-growing overlay topologies. It is based on
the equilibrium model of uncorrelated scale-free networks that
has been considered in [19][26] and makes use of a biased
random walk strategy in order to sample the endpoints of edges
being rewired. As we shall see later, the efficiency and thus
feasibility of the mechanism is based on the favorable
expansion properties of certain classes of random networks. A
detailed description and derivation of the proposed protocol
will be presented in section 2. Here we further present some
analytical arguments on the convergence behavior of the random
walk sampling strategy underlying the protocol presented in
the subsequent section 3. In section 4 we present simulation
results that have been obtained using an implementation of the
proposed rewiring scheme. Having briefly reviewed related
work, in section 6 we conclude the article by summarizing its
contribution and pointing out a number of open issues and
threats to validity.

2. Creating and Adapting Scale-Free Overlays 

As has been argued above, the exponent $\gamma$ is a
macroscopic, statistical parameter that influences the
structural properties of random networks with a power law
degree distribution $P(k) \propto k^{-\gamma}$. In the
following, we thus intend to derive a distributed protocol
that can be used to effectuate scale-free network topologies
with a particular degree distribution exponent. As initial
situation, we assume an arbitrary, connected overlay topology.
While the scheme requires no particular initial state of the
overlay as long as it is connected, it will later be argued
that the initial topology influences the efficiency of the
scheme in terms of the number of messages that need to be
exchanged. For simplicity, we further assume that each of the
$n$ nodes is uniquely identified by a numeric identifier
$i \in \{1, \dots, n\}$. However, in sufficiently large
systems, per-node quantities $i$ that are chosen uniformly at
random, and thus not necessarily unique, can be used instead.
In order to simplify derivation and analysis, we further
consider a static situation in which no nodes enter or leave
the system. Clearly, the main motivation to use a
probabilistic overlay topology in the first place is to
support highly dynamic systems in which node joins and exits
are frequent. Although in this article we only consider the
simpler static situation, we argue that our results can
readily be extended to dynamic situations with fluctuating
participants.

In order to derive a distributed scheme that can be used to
influence the structure of scale-free overlays, we first need
a model that is capable of generating network topologies with
tunable power law degree distribution exponents. Here we use a
simple equilibrium model for scale-free networks with a fixed
number of nodes that has been introduced in [19] and analyzed
in [26]. In this model it is assumed that each node
$i \in \{1, \dots, n\}$ is assigned a weight
$w_i \sim i^{-\alpha}$ for some parameter $\alpha$ in the
range (0, 1). It is then assumed that $m$ edges are created
between pairs of nodes $(i, j)$ chosen with probabilities
$p_i$ and $p_j$ that are given by the normalized weights

$$p_i = \frac{i^{-\alpha}}{\sum_{k=1}^{n} k^{-\alpha}} \qquad (1)$$

As has been argued in [26], this simple model produces
uncorrelated random scale-free networks with a degree
distribution

$$P(k) \propto k^{-\left(1 + \frac{1}{\alpha}\right)}$$

Hence, for $\alpha \to 0$, the model yields a scale-free
network with the exponent $\gamma \to \infty$, while for
$\alpha \to 1$ the exponent $\gamma$ converges to two. It thus
provides a parameter that can be adjusted to effectuate
scale-free networks with an arbitrary degree distribution
exponent $\gamma$ in the range $(2, \infty)$.
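
The model described above can be sketched in a few lines of Python. This is a minimal illustration rather than the generator used in [19] or [26]: the function names are hypothetical, and rejecting self-loops and duplicate edges is an assumption that the text does not spell out.

```python
import random
from collections import Counter

def static_model_edges(n, m, alpha, seed=0):
    """Draw m edges whose endpoints are sampled with p_i proportional to i**(-alpha)."""
    rng = random.Random(seed)
    nodes = list(range(1, n + 1))
    weights = [i ** (-alpha) for i in nodes]  # unnormalized weights w_i = i^(-alpha)
    edges = set()
    while len(edges) < m:
        # random.choices normalizes the weights internally
        i, j = rng.choices(nodes, weights=weights, k=2)
        if i != j:  # assumption: no self-loops; the set drops duplicate edges
            edges.add((min(i, j), max(i, j)))
    return edges

def exponent_for(alpha):
    """Degree distribution exponent gamma = 1 + 1/alpha stated for the model."""
    return 1.0 + 1.0 / alpha

edges = static_model_edges(n=1000, m=5000, alpha=0.5)
degree = Counter(v for edge in edges for v in edge)
print(exponent_for(0.5), degree[1], degree[1000])
```

With these illustrative parameters, node 1 carries the largest weight and should end up as a hub, while node 1000 receives only a few edges.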

In order to apply this simple model in practical networked
systems, a distributed scheme is required that creates edges
between two nodes $i$ and $j$ in an overlay network with
probability $p_i p_j$. For this, we assume that we start with
a random, connected overlay topology consisting of $n$ nodes
and $m$ edges. In practice, this initial topology may emerge
by means of an arbitrary bootstrapping method that connects
joining nodes to existing participants either
deterministically or at random. We can then view the above
model as a rewiring scheme that gradually replaces existing
edges so that edges between node pairs emerge with the desired
node-dependent probabilities. For this, a node initiating the
rewiring of an edge must be able to sample two new endpoints
for the edge being rewired according to the probability
measure given in equation 1. While one can imagine different
mechanisms by which this can be achieved [24], a simple and
promising method to sample nodes in Peer-to-Peer systems is by
means of a random walk [36] [TSl .
For this we consider that nodes wishing to rewire an edge
sample two new endpoints by means of two independent random
walks through the current network topology. For a classical,
unbiased random walk, the probability $\pi_i(t)$ to find the
walker at an arbitrary time $t$ at node $i$ converges to

$$\pi_i = \frac{d_i}{n \bar{d}}$$

where $\bar{d}$ is the average node degree of the network. In
order to sample nodes with the probabilities given in
equation 1 we need to introduce a random walk bias that
influences the transition probabilities accordingly.
Considering a random walk in a connected overlay topology
$G(V, E)$ as a Markov chain with state space $V$ and
stationary distribution $\pi$, the random walk bias can be
configured by means of a Metropolis-Hastings chain [29][21][5]
in such a way that a desired stationary distribution $\pi$
holds. In general, this can be achieved by introducing a bias
as shown in the following transition matrix $T$:

$$T_{ij} = \begin{cases} \frac{1}{d_i} \min\left\{1, \frac{\pi_j d_i}{\pi_i d_j}\right\} & \text{if } (i,j) \in E \\ 1 - \sum_{k \neq i} T_{ik} & \text{if } i = j \\ 0 & \text{otherwise} \end{cases} \qquad (2)$$

Here $d_i$ denotes the current degree of node $i \in V$ and an
entry $T_{ij}$ gives the probability that a random walk
residing at node $i$ moves to node $j$. The fact that this
transition matrix has stationary distribution $\pi$ follows
from the reversibility of the underlying Markov chain, as well
as from its irreducibility (assuming a connected network
topology) and aperiodicity (self-loops are possible). Under
these restrictions, the Markov chain convergence theorem
ensures that the probability $\pi_i(l)$ that a random walker
started in an arbitrary node resides at node $i$ after $l$
steps converges to $\pi_i$ as $l$ goes to infinity.

From this, one can easily configure a random walk bias that
results in a stationary distribution suitable to sample nodes
in a way that, after rewiring, a scale-free network with
degree distribution exponent $\gamma$ emerges. From the
probability $p_i$ in equation 1 and the fact that it gives
rise to a scale-free network with degree distribution exponent
$\gamma = 1 + \frac{1}{\alpha}$, we obtain the desired
stationary distribution

$$\pi_i = \frac{i^{-\frac{1}{\gamma-1}}}{\sum_{k=1}^{n} k^{-\frac{1}{\gamma-1}}} \qquad (3)$$

which, with equation 2 and the ratio
$\pi_j / \pi_i = (i/j)^{\frac{1}{\gamma-1}}$, yields the
following transition matrix $P$:

$$P_{ij} = \begin{cases} \frac{1}{d_i} \min\left\{1, \frac{d_i}{d_j} \left(\frac{i}{j}\right)^{\frac{1}{\gamma-1}}\right\} & \text{if } (i,j) \in E \\ 1 - \sum_{k \neq i} P_{ik} & \text{if } i = j \\ 0 & \text{otherwise} \end{cases} \qquad (4)$$

Thus, a random walk with the above bias can be used to sample
endpoints of edges and thus perform rewiring operations that
effectuate scale-free network topologies with a particular
degree distribution exponent.
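
The Metropolis-Hastings construction can be checked numerically on a toy graph. The sketch below assumes uniform neighbor proposals and a Zipf target distribution over node IDs with exponent $1/(\gamma-1)$; the helper names and the five-node example graph are hypothetical. It builds one row of the biased transition matrix at a time and verifies that the target distribution is unchanged by one step of the chain.

```python
def zipf_stationary(n, gamma):
    """Target distribution pi_i proportional to i**(-1/(gamma-1)) for i = 1..n."""
    a = 1.0 / (gamma - 1.0)
    weights = [i ** (-a) for i in range(1, n + 1)]
    total = sum(weights)
    return {i: w / total for i, w in enumerate(weights, start=1)}

def transition_row(i, neighbors, pi):
    """Row i of the Metropolis-Hastings matrix with uniform neighbor proposals."""
    d_i = len(neighbors[i])
    row = {}
    for j in neighbors[i]:
        d_j = len(neighbors[j])
        accept = min(1.0, (pi[j] * d_i) / (pi[i] * d_j))
        row[j] = accept / d_i
    row[i] = 1.0 - sum(row.values())  # rejected proposals stay at node i
    return row

def step(dist, neighbors, pi):
    """Apply one step of the chain to a probability distribution over nodes."""
    out = {v: 0.0 for v in neighbors}
    for i, mass in dist.items():
        for j, p in transition_row(i, neighbors, pi).items():
            out[j] += mass * p
    return out

# Small connected example graph (hypothetical), node IDs 1..5, no self-edges
neighbors = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 5}, 4: {2, 5}, 5: {3, 4}}
pi = zipf_stationary(5, gamma=2.5)
after_one_step = step(pi, neighbors, pi)
print(max(abs(after_one_step[v] - pi[v]) for v in neighbors))
```

The printed deviation should be at the level of floating point rounding, because the chain satisfies detailed balance with respect to the target distribution.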



2.1. Bounding the Random Walk Length 

The goal of this article is to practically apply the above
strategy in a distributed rewiring scheme. Hence, an important
question that needs to be answered is how many steps a random
walk with the above bias needs to take before the probability
$\pi_i(l)$ to find it in a node $i$ after $l$ steps is
sufficiently close to the desired stationary limit $\pi_i$. In
the rewiring protocol presented in the following section, this
translates to the number of messages that need to be exchanged
for a single rewiring operation. To assess this convergence
behavior, one first needs to give a formal definition of when
two probability distributions $\pi$ and $\pi'$ shall be
considered sufficiently close. For this we use the usual
definition of the total variation distance $D$ which, for two
probability measures $\pi$ and $\pi'$ and a finite state space
$V$, is defined as follows:

$$D(\pi, \pi') = \frac{1}{2} \sum_{x \in V} |\pi(x) - \pi'(x)|$$
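
For two distributions stored as dictionaries, the total variation distance (half the L1 distance between them) can be computed as in the following minimal sketch. The function name and the dictionary representation are illustrative choices.

```python
def total_variation_distance(p, q):
    """Half the L1 distance between two distributions over a finite state space."""
    states = set(p) | set(q)  # states missing from one distribution count as 0
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in states)

uniform = {"a": 0.5, "b": 0.5}
skewed = {"a": 0.75, "b": 0.25}
print(total_variation_distance(uniform, skewed))
```

The distance is 0 for identical distributions and 1 for distributions with disjoint support.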
The configuration of the random walk bias according to
equation 4 and the Markov chain convergence theorem ensure
that $D(\pi(l), \pi) \to 0$ for $l \to \infty$. For an
arbitrarily chosen total variation distance $\epsilon > 0$ we
can then assess the number of steps $l$ our random walk needs
to take until $D(\pi(l), \pi) < \epsilon$. In order to bound
the minimally required number of steps $l$, the arguments put
forth in [33] can be used. Here it is argued that an upper
bound for $l$ is given by
where $\pi$ is the stationary distribution of the Markov
chain, $\lambda_2(P)$ is the second largest eigenvalue of the
transition matrix $P$ and $s$ is the initial state. Thus,
finding an upper bound for the number of random walk steps
requires a lower bound for the eigenvalue gap
$1 - |\lambda_2(P)|$ of the transition matrix. Unfortunately,
obtaining good bounds for the eigenvalues of stochastic
matrices is a non-trivial task. Nevertheless, based on the
canonical path approach introduced in [13][33], analytical
arguments concerning the convergence behavior of random walks
with a Zipf stationary distribution have been put forth in
[36][37]. In the following we briefly repeat these arguments
for the particular random walk strategy considered in this
article. In [37] it has been argued that, if the stationary
distribution $\pi$ is highly skewed, a lower bound for the
eigenvalue gap $1 - |\lambda_2(P)|$ is given by

$1 - |\lambda_2(P)| >$



Here $D$ denotes the diameter of the network topology upon
which the random walk operates, $\pi_{min}$ is the minimum
probability ascribed to any vertex by the stationary
distribution and $d_{max}$ is the maximum degree of any vertex
in the network. Thus, for the special case of Zipf stationary
distributions, an asymptotic upper bound for the random walk
length $l$ required to achieve a total variation distance
smaller than $\epsilon$ is given as [33][37]:
For a random walk strategy configured to eventually effectuate
a degree distribution exponent $\gamma$ and thus stationary
distribution $\pi^{\gamma}$, for the inverse stationary
probability of the starting node $s$, the following bound
holds:

While this holds for arbitrary $\gamma \in [2, \infty)$ and
starting nodes $s$, for the special case of node $n$ we can
give a better bound by observing that, due to the increasing
skewness, node $n$ is ascribed minimal probability for
$\gamma = 2$, that is for $\gamma \in [2, \infty)$

$$\pi^{\gamma}_{min} \geq \pi^{\gamma=2}_{min}$$

holds. With this, we can bound the inverse minimal probability
by considering the logarithmic growth of the harmonic series,
so that

$$\frac{1}{\pi^{\gamma}_{min}} \leq \frac{1}{\pi^{\gamma=2}_{min}} = n \cdot \sum_{k=1}^{n} k^{-1} = n \cdot (\ln(n) + \Gamma + r_n)$$

where $\Gamma$ denotes the Euler-Mascheroni constant and
$r_n \to 0$ in the limit of large $n$. Assuming an initial
scale-free topology with $n$ nodes and degree distribution
exponent $\gamma_i$ allows to asymptotically bound diameter
and maximum degree as $O(\ln(n))$ and 0(71^*) respectively
[37]. Thus, for large $n$ and a random walk started in node
$s$, an asymptotic upper bound for the minimal length $l$ to
achieve total variation distance smaller than $\epsilon$ can
be given as follows:
This theoretic bound scales worse than linearly with the
network size $n$. However, as has previously been observed
e.g. in {Tf\, the underlying bounding technique is not
necessarily tight, that is, the actual convergence behavior of
a random walk can be considerably better. Since at present
obtaining tight upper bounds for the convergence of Markov
chains in complex network topologies is an open research
issue, in section 4 we present simulations that have been
performed to derive practicable random walk lengths
empirically. As will be argued later, the results of these
simulations suggest that the adaptation scheme presented in
this article can be practically implemented with reasonable
random walk lengths. Although these results suggest that the
analytical bounds shown above are not tight and thus
uninformative with respect to the performance of the scheme in
practice, they can nevertheless be used to study by which
parameters the convergence behavior of a random walk is
influenced. From equation 6 one can for example infer that the
upper bound for the minimal random walk length will generally
be higher when wanting to effectuate highly skewed scale-free
networks with exponents close to two.
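
One ingredient of this dependence, the inverse minimal stationary probability, is easy to tabulate. The sketch below assumes the Zipf stationary distribution over node IDs with exponent $1/(\gamma-1)$, so that node $n$ receives the smallest probability; the function name is hypothetical. It compares the value for several exponents with the harmonic-series expression for $\gamma = 2$.

```python
import math

EULER_MASCHERONI = 0.5772156649015329

def inverse_min_probability(n, gamma):
    """1/pi_n for pi_i proportional to i**(-1/(gamma-1)), i = 1..n."""
    a = 1.0 / (gamma - 1.0)
    normalization = sum(k ** (-a) for k in range(1, n + 1))
    return normalization / n ** (-a)

n = 10_000
for gamma in (2.0, 2.5, 3.5):
    print(gamma, round(inverse_min_probability(n, gamma)))

# Harmonic-series expression for gamma = 2, without the vanishing term r_n
print(round(n * (math.log(n) + EULER_MASCHERONI)))
```

The inverse minimal probability grows as the exponent approaches two, which is consistent with the observation that the bound becomes weaker for highly skewed target distributions.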

3. Protocol Definition 

The arguments laid out in the previous section suggest a
rewiring protocol that consists of the following three basic
operations: (1) In periodic intervals, a node $a$ selects an
edge to a random neighbor $b$ that has not yet been rewired.
(2) A random walk with the bias presented in equation 4 is
started to sample two nodes $x$ and $y$ with probabilities
proportional to $\pi_x$ and $\pi_y$ respectively. (3) The edge
$(a, b)$ is replaced by the edge $(x, y)$ and the latter is
marked as having resulted from a rewiring operation. After all
$m$ edges of the overlay have been rewired, a scale-free
overlay is obtained whose exponent depends on the particular
choice of the random walk bias defined in equation 4. In
algorithms 1 - 4 we give a detailed algorithmic description of
the protocol. In these algorithms, $d_v$ denotes the degree of
node $v$, $i_v$ is the ID of node $v$ and self denotes the
node at which the code is executed. We further assume that
nodes have information about the IDs and the degrees of their
nearest neighbors.

The detailed algorithm of the main program loop that is
responsible for initiating random walks is shown in
algorithm 1. Rewiring operations are initiated by nodes in
regular intervals only for those edges that have not yet been
rewired. By this means, at most $m$ rewirings are performed
where $m$ is the number of edges in the initial random network
topology. The number of rewiring operations and thus message
transfers taking place within a certain time interval can be
adjusted by choosing an appropriate (network-size dependent)
delay value. When a node with an unmarked edge wakes up, a
rewiring operation is initiated. In order to prevent both
endpoints of an edge from initiating rewiring operations for
the same edge, rewirings are only started by the node with
higher degree or, if the degrees are equal, by the node with
the smaller ID. As we shall see later in section 4, the choice
of letting a rewiring be initiated by the better connected
endpoint can improve the performance of the scheme. To find
the endpoints of a new edge by which the previously unmarked
edge shall be replaced, a node initiates a biased random walk
through the overlay (lines 6 - 11). In order to retain
connectedness and prevent nodes from being isolated we further
assume that only edges from nodes with degree greater than 1
are rewired.

Algorithm 1 Main Loop

 1: loop
 2:   Sleep(delay)
 3:   if neighbors.Count > marked.Count then
 4:     n = RandomUnmarkedNeighbor()
 5:     if d_n > 1 && d_self > 1 && (d_self > d_n || (d_self = d_n && i_self < i_n)) then
 6:       {Initiate random walk}
 7:       msg.Hops ← 0
 8:       msg.a ← self
 9:       msg.b ← n
10:       msg.target ← null
11:       Send({walk, msg}, n)
12:     end if
13:   end if
14: end loop
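
The condition in line 5 of algorithm 1 can be isolated as a small predicate. The following is a sketch with a hypothetical function name; it assumes that both endpoints know each other's degree and ID, as stated above, and that IDs are distinct.

```python
def should_initiate(d_self, d_n, i_self, i_n):
    """Return True if this endpoint is responsible for rewiring the edge to n."""
    if d_self <= 1 or d_n <= 1:
        return False  # edges from degree-1 nodes are never rewired
    # The higher-degree endpoint initiates; ties are broken by the smaller ID
    return d_self > d_n or (d_self == d_n and i_self < i_n)

# For an edge between node 7 (degree 5) and node 2 (degree 3), only node 7 initiates
print(should_initiate(5, 3, 7, 2), should_initiate(3, 5, 2, 7))
```

For any edge whose endpoints both have degree greater than 1, exactly one of the two endpoints satisfies the predicate, which avoids duplicate rewirings of the same edge.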



When a node $v$ receives a random walk message, it needs to
ensure that the message is forwarded with the bias given in
equation 4. In algorithm 2 this is done in lines 14 - 21.
Comparing the algorithm with the stochastic matrix $P$ defined
in equation 4, here we select a neighbor uniformly at random
and draw a random value uniformly in the interval [0, 1] that
indicates whether the random walk transitions along this edge
or whether it stays in the current node. One can imagine
different schemes by which the two endpoints $x$ and $y$ of
the new edge $(x, y)$ are sampled. The node initiating the
rewiring could for example start one random walk for each
endpoint of the new edge, collect the target nodes of both
walks and connect them to each other. In order to simplify the
implementation, in algorithms 1 and 2 we propose to sample
both endpoints of the new edge in a single random walk of
length $2l$, assuming that after $l$ steps, the node at which
the random walk currently resides is stored in the field
target of the message being forwarded. By this means, all
information related to a rewiring operation is stored in the
random walk message. Hence the node at which the random walk
arrives after $2l$ steps has all information necessary to
initiate the rewiring operation. For this, it creates a
connection to the target node stored in the message while
initiating the deletion of the edge between node $a$ that has
started the random walk and its neighbor $b$. As can be seen
in algorithm 4, a disconnection requires, apart from removing
the edge, no further action at the side of the node from which
the edge is removed. As shown in algorithm 3, both endpoints
of the newly created edge mutually mark each other in order to
prevent it from being rewired again in future invocations of
the protocol. We emphasize that this is to prevent unnecessary
rewiring operations and thus message exchanges rather than
being required for the functioning of the protocol.

Algorithm 2 Node receives {walk, msg}

 1: msg.Hops ← msg.Hops + 1
 2: if msg.Hops = l then
 3:   {Store Endpoint}
 4:   msg.target ← self
 5: else if msg.Hops = 2l then
 6:   {Rewire}
 7:   if !neighbors.Contains(msg.target) && msg.target ≠ self then
 8:     Send({disconnect, msg.a}, msg.b)
 9:     Send({disconnect, msg.b}, msg.a)
10:     Send({connect, self}, msg.target)
11:     Send({connect, msg.target}, self)
12:   end if
13: else
14:   n ← self.RandomNeighbor
15:   if random.Next() < (d_self / d_n) · (i_self / i_n)^(1/(γ-1)) then
16:     {Forward Random Walk}
17:     Send({walk, msg}, n)
18:   else
19:     {Self-Loop}
20:     Send({walk, msg}, self)
21:   end if
22: end if
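
The sampling of both endpoints in a single walk can be simulated centrally. The sketch below is not the distributed message handler itself: it keeps the whole overlay in one dictionary, applies the bias on every hop, continues the walk after storing the first endpoint, and uses hypothetical function names. Node IDs are assumed to be the ranks $1, \dots, n$.

```python
import random

def acceptance(d_self, d_n, i_self, i_n, gamma):
    """Forwarding bias (d_self/d_n) * (i_self/i_n)**(1/(gamma-1)), capped at 1."""
    return min(1.0, (d_self / d_n) * (i_self / i_n) ** (1.0 / (gamma - 1.0)))

def sample_endpoints(neighbors, start, l, gamma, rng):
    """Simulate one walk of 2*l hops; return the nodes reached after l and 2*l hops."""
    current, target = start, None
    for hop in range(1, 2 * l + 1):
        n = rng.choice(sorted(neighbors[current]))  # uniform neighbor proposal
        p = acceptance(len(neighbors[current]), len(neighbors[n]), current, n, gamma)
        if rng.random() < p:
            current = n  # forward the walk; otherwise it stays (self-loop)
        if hop == l:
            target = current  # first endpoint, kept in msg.target
    return target, current

def rewire(neighbors, a, b, x, y):
    """Replace edge (a, b) by (x, y) unless (x, y) is a self-loop or already exists."""
    if x == y or y in neighbors[x]:
        return False
    neighbors[a].discard(b)
    neighbors[b].discard(a)
    neighbors[x].add(y)
    neighbors[y].add(x)
    return True

# Hypothetical five-node overlay; node 2 rewires its edge to node 1
neighbors = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 5}, 4: {2, 5}, 5: {3, 4}}
x, y = sample_endpoints(neighbors, start=2, l=10, gamma=3.0, rng=random.Random(1))
rewired = rewire(neighbors, 2, 1, x, y)
print(x, y, rewired)
```

Whether or not the rewiring is carried out, the number of edges in the overlay stays the same, which matches the non-growing setting considered here.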



Algorithm 3 Node receives {connect, y}

1: neighbors.Add(y)
2: marked.Add(y)

Algorithm 4 Node receives {disconnect, b}

1: neighbors.Remove(b)

Concluding the description of the proposed protocol, we
consider the size and number of messages that need to be sent
across the network. Sampling the two endpoints of the new edge
requires at most $2l$ messages¹ where $l$ is the number of
steps taken by a single random walk to sample a node with a
probability sufficiently close to the stationary distribution
$\pi$. Once both endpoints of the new edge have been found,
the rewiring requires two messages to disconnect nodes $a$ and
$b$ and one message to connect to the node target that has
been stored in the random walk message.

Since the IDs of the initial node, its neighbor and the
intermediate target, as well as the current hop count need to
be stored in the random walk message, the required number of
bits for a message is logarithmic in the number $n$ of nodes
in the system. Thus, the number of bits that need to be
transferred per rewiring operation is $O(l \cdot \log(n))$.
Since exactly one rewiring operation is executed for each of
the $m$ edges in the overlay topology, the total number of
bits that need to be transferred in order to create a
scale-free topology with the desired exponent is
$O(m \cdot l \cdot \log(n))$. We further need to store one
additional bit per edge, indicating whether an edge has
previously been rewired or not.
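
These counts can be turned into a rough back-of-the-envelope estimate. The sketch below uses hypothetical function names and a simple binary encoding (three node IDs plus a hop counter per walk message); these encoding details are assumptions for illustration and are not taken from the protocol description.

```python
import math

def messages_per_rewiring(l):
    """Upper bound: 2*l walk messages, 2 disconnects, 1 connect."""
    return 2 * l + 3

def walk_message_bits(n, l, ids_in_message=3):
    """Bits for the IDs of a, b and target plus a hop counter in the range 0..2*l."""
    id_bits = math.ceil(math.log2(n))
    hop_bits = math.ceil(math.log2(2 * l + 1))
    return ids_in_message * id_bits + hop_bits

def total_bits_upper_estimate(m, n, l):
    """Crude estimate: every message is charged the size of a walk message."""
    return m * messages_per_rewiring(l) * walk_message_bits(n, l)

# Example values in the range used for the simulations: 5000 nodes, 25000 edges, l = 20
print(total_bits_upper_estimate(m=25000, n=5000, l=20))
```

The estimate grows linearly in $m$ and $l$ and logarithmically in $n$, in line with the asymptotic expression above.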

4. Evaluation 

Having given a description of the rewiring protocol as well as
analytical arguments about its convergence behavior, in this
section we present simulation results that have been obtained
using an implementation of the proposed scheme. This
evaluation is split into two parts. In a first step, we seek
to establish by simulation a practicable lower bound for the
minimally required random walk length $l$. We further study
the influence of the initiating node's degree on the
convergence time of a random walk. Based on these results, in
a second step we then simulate the rewiring protocol and study
its influence on a network's degree distribution.

4.1. Minimum Random Walk Length 

While theoretic asymptotic upper bounds for the required
number of steps $l$ of the random walk have been presented in
section 2.1, here we empirically study the convergence
behavior for a number of random walk lengths. By this we
intend to derive a practicable random walk length that
represents a reasonable trade-off between the imposed number
of messages and the resulting total variation distance. We
further intend to investigate how the minimum random walk
length changes as the network size is varied. The following
results have been obtained as follows. In each simulation run
a number $R$ of random walks was started from a randomly
chosen node in a random network topology. For each simulation
of a random walk of length $l$, a hit counter was increased in
the node at which the random walk resided in the $l$-th step.
When $R$ random walks had been simulated, the total variation
distance was computed based on the observed hit frequencies
and the stationary distribution expected for the chosen random
walk bias. Depending on network size and the minimum
probability $\pi_{min}$ of the stationary distribution, the
number of random walk iterations $R$ was chosen in a range
between 10^ and lO'*. In particular, it was chosen such that
nodes with minimum stationary probability $\pi_{min}$ were
expected to be hit often enough to allow a meaningful estimate
of the total variation distance. The above procedure was then
repeated for different random network realizations and
starting nodes. Finally the minimum, maximum and average total
variation distance of a simulation run was computed and the
procedure was repeated for different network sizes, random
walk biases and random walk lengths.
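
The measurement procedure can be sketched as follows. This is not the simulation code used for the results: the ring-with-chords test graph, the function names and the small values of $R$ and $n$ are assumptions chosen to keep the example fast. Node IDs are assumed to be the ranks $1, \dots, n$.

```python
import random
from collections import Counter

def zipf_stationary(n, gamma):
    """Expected stationary distribution, pi_i proportional to i**(-1/(gamma-1))."""
    a = 1.0 / (gamma - 1.0)
    weights = [i ** (-a) for i in range(1, n + 1)]
    total = sum(weights)
    return {i: w / total for i, w in enumerate(weights, start=1)}

def ring_with_chords(n, chords, rng):
    """Connected test graph (hypothetical): a ring plus random extra edges."""
    neighbors = {v: set() for v in range(1, n + 1)}
    for v in range(1, n + 1):
        w = v % n + 1
        neighbors[v].add(w)
        neighbors[w].add(v)
    added = 0
    while added < chords:
        u, v = rng.sample(range(1, n + 1), 2)
        if v not in neighbors[u]:
            neighbors[u].add(v)
            neighbors[v].add(u)
            added += 1
    return neighbors

def walk(adj, start, l, exponent, rng):
    """Biased walk of l hops; returns the node at which it resides afterwards."""
    current = start
    for _ in range(l):
        n = rng.choice(adj[current])
        ratio = (len(adj[current]) / len(adj[n])) * (current / n) ** exponent
        if rng.random() < ratio:
            current = n
    return current

def empirical_tvd(neighbors, start, l, gamma, runs, seed=0):
    """Total variation distance between hit frequencies and the expected distribution."""
    rng = random.Random(seed)
    adj = {v: sorted(nb) for v, nb in neighbors.items()}
    exponent = 1.0 / (gamma - 1.0)
    hits = Counter(walk(adj, start, l, exponent, rng) for _ in range(runs))
    pi = zipf_stationary(len(adj), gamma)
    return 0.5 * sum(abs(hits[v] / runs - pi[v]) for v in adj)

graph = ring_with_chords(n=50, chords=50, rng=random.Random(42))
short = empirical_tvd(graph, start=25, l=1, gamma=3.0, runs=2000)
longer = empirical_tvd(graph, start=25, l=100, gamma=3.0, runs=2000)
print(short, longer)
```

With a finite number of runs the estimate never reaches zero, because sampling noise contributes to the measured distance. This is why the number of walks has to be large relative to the inverse of the smallest stationary probability.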

Figure 1 shows the random walk length $l$ minimally required
for the average total variation distance to fall below
$\epsilon = 0.05$ in scale-free Barabási-Albert (BA) networks
randomly generated by the preferential attachment scheme
presented in [7]. Results are shown for different network
sizes and for random walks configured to effectuate three
different exponents 2.1, 2.5 and 3.5. Instead of the linear
scaling behavior suggested by the theoretical upper bound
presented in section 2.1, the observed required length $l$
scales in a sub-linear fashion. The observation that the
actual convergence behavior is significantly better than the
theoretical bound that can be obtained by a canonical path
approach is consistent with the observations presented in [35]
and indicates that the rewiring scheme can be implemented
efficiently in practice. Further simulation results that have
been obtained for Erdős-Rényi (ER) random graphs indicate that
the number of steps required to achieve a total variation
distance smaller than 0.05 is in the same range as that for
random power law graphs. Informally, this observed fast
convergence can be attributed to the good expansion properties
and the small diameter of both classical random graphs and
random scale-free networks.

In networks with highly heterogeneous connectivity, a 
further interesting question is how the choice of the starting 
node of a random walk influences the total variation dis- 
tance that can be achieved by a fixed random walk length. 
To investigate this, a number of random scale-free BA net- 
works was created and a large number of random walks was 
started from each node of the networlr] The frequency with 
which nodes were target of random walks was recorded and 
the resulting total variation distance to the expected station- 



At most 21 since self-loops are allowed to ensure aperiodicity of the 
underlying Markov chain. While a self-loop is a hop of the random walk, 
it does not entail a message exchange. 



^Here large again means sufficiently large to reasonably compute the 
total variation distance. 
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Figure 1: Minimum random walk length I required to achieve 
D(7r(/),7r) < 0.05 in Barabasi-Albert networks with 
random walk biases configured to effectuate exponents 
2.1, 2.5 and 3.5 (Lines are drawn to guide the eye) 

ary distribution was computed for each starting node indi- 
vidually. Figure [2] shows the relation between the degree 
of initial nodes and the total variation distance that was
achieved in a representative simulation in random
Barabási-Albert networks with 1000 nodes, $l = 5$ and a random
walk bias to effectuate $\gamma = 3$. The results suggest that
random walks converge better on average when they are started
at highly connected nodes. This is a rather intuitive result,
since a random walk starting at a high degree node can
potentially reach a large number of nodes even in a single
step. In the protocol presented in [3], this observation
justifies the choice that rewiring operations for an edge
$(i, j)$ are initiated by the node with higher degree.
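The dependence on the starting node can be explored with a small sketch along the following lines. It rests on simplifying assumptions that differ from the experiment described above. The walk is an unbiased lazy walk rather than one biased towards $\gamma = 3$. The preferential attachment generator is a simplified stand-in. Exact hop distributions are computed instead of sampling many walks. All function names are hypothetical.

```python
import random


def barabasi_albert(n, m, seed=0):
    """Simplified preferential attachment: each new node links to m distinct
    existing nodes chosen with probability proportional to their degree."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    pool = []  # every node appears once per incident edge
    for u in range(m + 1):  # start from a small clique
        for v in range(u + 1, m + 1):
            adj[u].add(v)
            adj[v].add(u)
            pool += [u, v]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(pool))
        for target in chosen:
            adj[new].add(target)
            adj[target].add(new)
            pool += [new, target]
    return adj


def tv_after_walk(adj, start, length, stay=0.5):
    """Total variation distance to the degree-proportional stationary
    distribution after `length` hops of a lazy walk started at `start`."""
    total_degree = sum(len(nbrs) for nbrs in adj.values())
    dist = {v: 0.0 for v in adj}
    dist[start] = 1.0
    for _ in range(length):
        nxt = {v: stay * p for v, p in dist.items()}
        for v, p in dist.items():
            if p > 0.0:
                share = (1.0 - stay) * p / len(adj[v])
                for w in adj[v]:
                    nxt[w] += share
        dist = nxt
    return 0.5 * sum(abs(dist[v] - len(adj[v]) / total_degree) for v in adj)


def tv_by_start_degree(adj, length):
    """Mean distance reached, grouped by the degree of the starting node."""
    buckets = {}
    for v in adj:
        buckets.setdefault(len(adj[v]), []).append(tv_after_walk(adj, v, length))
    return {deg: sum(vals) / len(vals) for deg, vals in sorted(buckets.items())}


if __name__ == "__main__":
    network = barabasi_albert(300, 3, seed=1)
    for degree, distance in tv_by_start_degree(network, length=5).items():
        print(degree, round(distance, 3))
```

Printing the result per degree gives a table that can be compared qualitatively with the trend reported for Figure 2. The numbers themselves are not expected to match, since the walk here is unbiased.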

Figure 2: Correlation between the degree of the starting node
(x-axis: node degree) and the average $D(\pi(l), \pi)$ in
1000 node Barabási-Albert networks with $\gamma = 3$ and
$l = 5$

4.2. Degree Distribution 

We now turn to the question of how the rewiring protocol
described in section 3 influences the degree distribution of
a network topology. All results presented in the following
figures have been obtained for networks consisting of 5000
nodes and roughly 25000 edges. Initial topologies upon which
the protocol was started were created using the
Barabási-Albert preferential attachment model as well as the
Erdős/Rényi model for classical random graphs. For the
following measurements we used a random walk length that was
long enough to achieve an average total variation distance
smaller than $\epsilon = 0.05$. Based on the results presented
in the previous section, a random walk length of $l = 20$ was
chosen.

In each simulation run, the protocol presented in Algorithm 3
was applied by all nodes until all edges were rewired. The
delay interval between individual rewiring iterations was
chosen such that, on average, a single rewiring took place per
time unit. Thus, for the chosen network size and rewiring
intensity, an adaptation cycle is expected to be completed
after roughly 25000 units of simulated time. The degree
distribution of the network topology was computed every 200
time units and a power law fit was performed to obtain the
current degree distribution exponent. For this, an R
implementation of the maximum likelihood power law fit
procedure described in [11] was used. This procedure yields
the fitted degree distribution exponent $\gamma$ that holds
with maximum likelihood, the minimum degree $d_{min}$ above
which the fit holds, as well as the Kolmogorov-Smirnov (KS)
statistic $D$. In general, better fits result in smaller
values of $D$, which allows us to evaluate whether the "power
law nature" of the degree distribution is strengthened or
fades away under the application of the rewiring scheme. All
results are averages of at least 5 independent applications of
our protocol on randomly chosen network realizations of
identical size. Simulation code, data analysis scripts,
datasets, simulation videos as well as some further graphical
representations of results that could not be included in this
article are available on the author's website.
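To make the three fit outputs concrete, the following sketch shows one way such a fit can be computed. It is not the R implementation used for the measurements. It assumes the common continuous approximation of the discrete maximum likelihood estimator, in which $d_{min}$ is shifted by 1/2, and it selects $d_{min}$ by minimizing the KS statistic. The function names and the `min_tail` parameter are hypothetical.

```python
import math


def fit_exponent(degrees, d_min):
    """Approximate maximum likelihood exponent for the tail d >= d_min.
    Assumes the continuous approximation with d_min shifted by 1/2."""
    tail = [d for d in degrees if d >= d_min]
    log_sum = sum(math.log(d / (d_min - 0.5)) for d in tail)
    return 1.0 + len(tail) / log_sum, tail


def ks_statistic(tail, gamma, d_min):
    """Largest gap between the empirical and the fitted cumulative
    distribution, evaluated at the observed degrees."""
    counts = {}
    for d in tail:
        counts[d] = counts.get(d, 0) + 1
    seen, worst = 0, 0.0
    for d in sorted(counts):
        seen += counts[d]
        empirical = seen / len(tail)
        model = 1.0 - ((d + 0.5) / (d_min - 0.5)) ** (1.0 - gamma)
        worst = max(worst, abs(empirical - model))
    return worst


def fit_power_law(degrees, min_tail=50):
    """Return (gamma, d_min, D) for the d_min that minimizes D,
    or None if fewer than min_tail positive degrees are available."""
    degrees = [d for d in degrees if d >= 1]
    best = None
    for d_min in sorted(set(degrees)):
        gamma, tail = fit_exponent(degrees, d_min)
        if len(tail) < min_tail:
            break  # too few samples left for a meaningful fit
        distance = ks_statistic(tail, gamma, d_min)
        if best is None or distance < best[2]:
            best = (gamma, d_min, distance)
    return best
```

Applied to the degree sequence of a topology snapshot, `fit_power_law` returns the same kind of triple as described above: exponent, lower cutoff and KS statistic.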



Figures 3(a) and 3(d) show the effect of the proposed protocol
on the degree distribution of networks that were initially
created by the Barabási-Albert (BA) and Erdős/Rényi (ER)
model, respectively. For BA networks, the fitted exponent of
the initial topology was on average 2.9, while for ER networks
the fitting procedure yielded 3.5 with a value of the KS
statistic $D$ that was at least 10 times larger. The results
suggest that the protocol does lead to an adaptation of the
degree distribution exponent of the overlay topology. In
particular, the evolution of the Kolmogorov-Smirnov statistic
$D$ shown in Figure 3(b) demonstrates that the scale-free
characteristic of BA networks is preserved. The increase of
the minimum degree above which the fit holds can be explained
by the exponent-dependent finite-size effects in scale-free
networks. For Erdős/Rényi networks, the roughly 10-fold
decrease of the KS statistic $D$ that can be seen in Figure
3(e) indicates the emergence of scale-free characteristics,
that is, the power law fit to the degree distribution becomes
more reliable. Figures 3(c) and 3(f) show the evolution of the
average maximum degree. The results are consistent with the
maximum degree expected in networks of the given size and with
different degree distribution exponents. Figure 5 shows the
average fit parameters for the network topology eventually
reached after adaptation. The results demonstrate that, as
expected from the underlying equilibrium model, the protocol
can be applied to transform arbitrary initial topologies into
scale-free networks whose degree distribution is described by
a power law with an exponent reasonably close to the intended
value.
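As a rough guide to how the expected maximum degree depends on network size and exponent, the standard natural cutoff estimate can be used. This is an assumption added here for illustration, not necessarily the estimate the comparison above is based on. It treats the degree distribution as a continuous power law above a minimum degree $k_{min}$ and asks for the degree above which one node is expected.

```latex
% Assumption: P(k) = (\gamma - 1) k_{min}^{\gamma - 1} k^{-\gamma} for k >= k_{min}
\[
  N \int_{k_{max}}^{\infty} P(k)\, dk
  = N \left( \frac{k_{min}}{k_{max}} \right)^{\gamma - 1} = 1
  \quad \Longrightarrow \quad
  k_{max} \approx k_{min}\, N^{1/(\gamma - 1)}
\]
```

For $N = 5000$ and an illustrative $k_{min} = 1$, this gives a maximum degree of roughly 2300 for $\gamma = 2.1$, roughly 290 for $\gamma = 2.5$ and roughly 30 for $\gamma = 3.5$. This shows how strongly the maximum degree depends on the exponent.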

So far, we have only studied simulations using a single
"cycle" of the proposed adaptation protocol. Figure 4 shows
results for a simulation in which three adaptation cycles
targeting different exponents were successively initiated in a
Barabási-Albert network with $10^4$ nodes and roughly
$5 \cdot 10^4$ edges. The chosen random walk length of
$l = 22$ was again consistent with the values found in section
4.1. In Figures 4(a) to 4(c), time steps in which adaptation
cycles were started are indicated by vertical lines. The
targeted degree distribution exponents were 2.9, 2.1 and 3.5,
respectively. Again, the results indicate that the proposed
scheme achieves the desired adaptation. Furthermore, Figure
4(b) shows how the "power law nature" of the degree
distribution, and thus the scale-free characteristic of the
network, temporarily fades during the adaptation and is
restored near the ends of adaptation cycles.
Figure 5: Average fit parameters after adaptation with targeted
exponents $\gamma_f \in [2.1, 3.5]$ for 5000 node Erdős/Rényi
(ER) and Barabási-Albert (BA) networks with roughly
25000 edges



5. Related Work 

During the last couple of years, a number of distributed
approaches to the construction, maintenance and adaptation of
probabilistically structured overlay topologies have been
proposed. Here we briefly summarize a selection of approaches
that are related to the present article. The use of random
walks for the sampling of random participants in P2P overlays
with good expansion (and thus Markov convergence) properties
has been proposed in [18, 36]. In particular, in [36] the use
of biased random walks for a non-uniform random sampling of
peers is studied and analytical arguments for their
convergence behavior are given. As argued in section 2, a
similar random walk strategy constitutes the foundation for
the adaptation scheme presented in this article. We finally
emphasize that different approaches to random sampling in P2P
networks have been proposed as well, for example the
gossip-based topology management scheme considered in [22]. To
date, it is however unclear how such alternative sampling
mechanisms could be used in our particular scenario.
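The idea of non-uniform sampling by a biased random walk can be sketched with a generic Metropolis-Hastings walk. This is a standard construction, named here explicitly, and not necessarily the bias used in [36] or in this article. It assumes that each node knows a positive weight and the degree of itself and of its immediate neighbors. The names `mh_step` and `sample_node` are hypothetical.

```python
import random


def mh_step(node, adj, weight, rng):
    """One hop of a Metropolis-Hastings walk whose stationary distribution
    is proportional to weight[v]; uses only local neighbor information."""
    proposal = rng.choice(adj[node])
    accept = min(1.0, (weight[proposal] * len(adj[node])) /
                 (weight[node] * len(adj[proposal])))
    # A rejected proposal acts as a self-loop, which keeps the chain aperiodic.
    return proposal if rng.random() < accept else node


def sample_node(start, adj, weight, length, rng):
    """Return the node reached after `length` hops starting at `start`."""
    node = start
    for _ in range(length):
        node = mh_step(node, adj, weight, rng)
    return node


if __name__ == "__main__":
    # Toy topology with adjacency lists and per-node weights (illustrative).
    adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2]}
    weight = {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0}
    rng = random.Random(1)
    print(sample_node(0, adj, weight, length=40, rng=rng))
```

Rejected proposals correspond to hops that stay at the current node. In a distributed setting such hops would not require a message exchange.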

Considering the problem of creating and adapting overlay
networks with scale-free characteristics in a distributed
fashion, it has been argued, e.g. in [25], that the degree
distribution exponent of scale-free networks can be tuned by
adjusting the connection preferences of joining nodes. While
this constitutes the basis for an adaptation of growing
networks, it remains unclear how the existing theoretical
models can be implemented efficiently in practice. Considering
practical networked systems, distributed strategies for the
creation of scale-free overlays with connectivity cutoffs
based on capacity constraints have been considered in [20].
Since the adaptation of the degree distribution exponent also
changes the maximum degree in the network, the schemes
presented in [20], although different in nature and intention,
can be viewed as being related to the scheme presented in the
present article. Finally, the problem of adapting the degree
distribution exponent in scale-free overlay networks has been
considered in our own previous work [31]. However, in contrast
to the protocol presented in the present article, no
analytical arguments for the functioning of the scheme
considered in [31] or for its precise effects on the degree
distribution exponent could be given. As such, the protocol
can be viewed as a companion scheme to the distributed power
law monitoring mechanism presented in

EH. 



6. Conclusion 

In this article, a simple adaptation protocol has been
presented that is based on a rewiring strategy and the
sampling of random nodes by means of a biased random walk. The
protocol can be used to effectuate random scale-free overlay
networks with an adjustable degree distribution exponent and
thus a tunable heterogeneity in connectivity. Apart from
adapting the degree distribution exponent in scale-free
networks, it is further suitable to transform arbitrary
connected topologies into scale-free networks, given that the
expansion properties of the initial topology provide
sufficiently fast convergence of the random walk strategy. In
Barabási-Albert and Erdős/Rényi networks, the random walk
length required to provide sampling probabilities that are
acceptably close to the stationary limit is found to be
significantly smaller than theoretical upper bounds. Based on
empirical findings, we argue that the proposed protocol is
thus a practicable approach to adapt probabilistically
structured overlays for large-scale P2P systems. The
performance of the protocol benefits from the fact that
rewiring operations are preferentially started by high degree
nodes as well as from the observed superior convergence
behavior of short random walks started at hub nodes. In a
future iteration of the protocol, it thus seems reasonable to
choose the length $l$ of each random walk individually based
on the degree of the node initiating it. A further potential
improvement is the use of two-stage random walks which, in a
first stage, preferentially move to hub nodes and then, in a
second stage, switch their bias to sample nodes according to
the desired stationary distribution.

Figure 3: Time evolution of 5000 node Barabási-Albert (a-c)
and Erdős/Rényi (d-f) networks during adaptation runs with
$\gamma_f \in [2.1, 3.5]$ and $l = 20$

We conclude this article by summarizing the main threats to
validity and open issues. An important aspect in any practical
application of the proposed scheme is the fact that a
sufficiently efficient implementation requires accepting
moderate total variation distances. While this keeps the
message overhead in an acceptable range, it limits the
randomness of the resulting network topology. While small
total variation distances suggest that the resulting deviation
of properties from those of truly random networks is rather
moderate, a further investigation of these effects is an open
issue. Furthermore, although we have argued that simulations
are a reasonable approach to establish empirical bounds on the
required random walk length, the range of network sizes and
topologies considered so far is fairly limited. A study of
further topologies must thus be considered future work.
Finally, and probably most importantly, the present
implementation assumes per-node weights that are either based
on fixed IDs or values chosen uniformly at random. An
important next step is to extend the proposed scheme in a way
that characteristics like bandwidth capacity, uptime,
processing power etc. of individual nodes are considered.
While we believe that this can be done in a distributed
fashion, the necessary adjustments of the proposed protocol
must be considered future work. Due to these limitations, in
its current state the proposed protocol is merely a first step
towards practical Peer-to-Peer systems with scale-free overlay
topologies that can efficiently be adapted in a directed and
distributed fashion.
Figure 4: Time evolution of 10000 node Barabási-Albert network during multiple adaptation cycles with $\gamma_0$ [...]
$\gamma_{108000} = 3.5$. Start times of adaptation cycles are indicated by vertical lines.